Entry Name:  "UBA-Bayle-MC1"

VAST Challenge 2015
Mini-Challenge 1

 

 

Team Members:

Federico Bayle, University of Buenos Aires, fedebayle@gmail.com      PRIMARY
Hernan Berinsky, University of Buenos Aires, hberinsky@gmail.com
Lucas Pogorelsky, University of Buenos Aires, ldpogorelsky@gmail.com
Martin Azcona, University of Buenos Aires, azconamartin@gmail.com


Student Team:  YES

 

Did you use data from both mini-challenges?  YES

 

Analytic Tools Used:

Tableau

Excel

R (ggplot2)

Python (igraph library)

 

Approximately how many hours were spent working on this submission in total?

80 hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

Video:

UBA-Bayle-MC1.wmv

 

 

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

MC1.1 – Characterize the attendance at DinoFun World on this weekend. Describe up to twelve different types of groups at the park on this weekend.

a. How big is this type of group?

b. Where does this type of group like to go in the park?

c. How common is this type of group?

d. What are your other observations about this type of group?

e. What can you infer about this type of group?

f. If you were to make one improvement to the park to better meet this group’s needs, what would it be?

Limit your response to no more than 12 images and 1000 words.

We found visitors entering and exiting the park together. Manual exploration of those visitors shows that they are also good candidates for walking together. These groups come in different sizes.

Looking at those candidates, we found that large groups tend to visit the park on fewer days than small groups. In particular, big groups visit the park on just one day of the weekend (no group with more than 15 members visited the park on more than one day).

Based on these observations, we define the following group categories by group size:

  • Big (more than 15 members)
  • Medium (between 5 and 15 members)
  • Small (between 2 and 4 members)
  • Individual (a single member)
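
As an illustration of the two steps above (finding candidate groups, then labelling them by size), here is a minimal Python sketch. It is not the code used for this submission. It assumes that each day's data is available as (visitor id, timestamp) pairs and that candidates are visitors whose first and last timestamps of the day coincide exactly; the function names `candidate_groups` and `size_category` are hypothetical.

```python
from collections import defaultdict

def size_category(n):
    """Map a group size to the categories defined in the list above."""
    if n > 15:
        return "Big"
    if n >= 5:
        return "Medium"
    if n >= 2:
        return "Small"
    return "Individual"

def candidate_groups(records):
    """Group visitor ids that share the same first and last timestamp.

    `records` is an iterable of (visitor_id, timestamp) pairs for one day.
    This layout is an assumption made for the sketch.
    """
    first, last = {}, {}
    for vid, ts in records:
        first[vid] = min(ts, first.get(vid, ts))
        last[vid] = max(ts, last.get(vid, ts))
    groups = defaultdict(list)
    for vid in first:
        # Visitors with identical (entry, exit) times become one candidate group.
        groups[(first[vid], last[vid])].append(vid)
    return [sorted(members) for members in groups.values()]
```

Exact timestamp matching is the simplest possible criterion; a tolerance of a few seconds, followed by the manual check described above, would be a natural refinement.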

A visitor may belong to one category on one day and to another on the next. When that happens, we assign the visitor to his or her largest group across the weekend (whenever we consider a measure over the whole weekend). In numbers, we found:

Big size groups (more than 15 visitors)
Day        # Groups   # Visitors (id's)
Friday     14         351
Saturday   23         703
Sunday     23         706

Medium size groups (between 5 and 15 visitors)
Day        # Groups   # Visitors (id's)
Friday     176        1202
Saturday   320        2153
Sunday     339        2136

Small size groups (between 2 and 4 visitors)
Day        # Groups   # Visitors (id's)
Friday     635        1883
Saturday   1068       3123
Sunday     1292       3821
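
The weekend-level rule stated earlier (a visitor takes the category of his or her largest group across the weekend) could be sketched as follows. The input layout, a mapping from day to per-visitor group size, and the name `weekend_group_size` are assumptions made for illustration.

```python
def weekend_group_size(daily_sizes):
    """Return, for each visitor, the largest group size seen on any day.

    `daily_sizes` maps a day name to {visitor_id: group size on that day}.
    This layout is an assumption made for the sketch.
    """
    best = {}
    for sizes in daily_sizes.values():
        for vid, n in sizes.items():
            best[vid] = max(n, best.get(vid, 0))
    return best
```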

Here is a capture of one such group at different moments in time (shown in different colours). The left picture shows how they move on Sunday. They entered the park at 9:14; at 9:22 some are at Squidosaur and others at the Information Assistance place; later we see them at Grinosaurus Stage, and then they move to Creighton Pavilion (Scott Jones's show) at 13:16. In the afternoon, we see them walking in Wet Land and later in Kiddie Land. At night, they are at the shops before they leave the park.

We notice a marked difference between groups in the kinds of attractions visited, in movement patterns, and in the places where they spend their time. Counting adjusted check-ins (number of check-ins divided by the number of days visited), we see that Individuals do not usually check in at Kiddie rides and prefer Thrill rides more than the other groups do. We can infer that the Individual category represents a single visitor, a small family, or a child with a parent using the application. Small and Medium groups have similar average check-ins on Kiddie rides and Thrill rides, but different variance. The Big groups differ notably, both in their lower variance on Thrill rides relative to the other groups and in their number of check-ins at Kiddie ride attractions.
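
The adjusted check-in measure used here can be sketched in a few lines of Python. The event layout (visitor id, day, event type) and the literal "check-in" label are assumptions for the sketch, not a description of the tools actually used.

```python
from collections import defaultdict

def adjusted_checkins(events):
    """Check-ins per visited day, for each visitor.

    `events` is an iterable of (visitor_id, day, event_type) tuples, where
    event_type is "check-in" or "movement" (assumed labels).
    """
    checkins = defaultdict(int)
    days = defaultdict(set)
    for vid, day, kind in events:
        days[vid].add(day)          # any event counts as a visit on that day
        if kind == "check-in":
            checkins[vid] += 1
    return {vid: checkins[vid] / len(days[vid]) for vid in days}
```

Dividing by the number of days visited keeps three-day visitors from dominating the comparison simply because they were in the park longer.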

Regarding movement patterns, we measure the activity of each group while visitors are moving and while they are stopped. At any moment a visitor is either stopped, moving, or checked in at some attraction; we consider a visitor stopped if he or she covers less than 10 meters in 2 minutes. Minutes and kilometers in the Moving time, Stopped and Distance charts are adjusted by the number of days visited. The first notable differences are the higher movement speed of the Individual group relative to the others (more than 150 meters/minute, over 50% higher on average than the other groups) and the very low variance in movement speed of the Big groups. Big groups appear to be many people walking together at the same pace (the same pace within a group and across different Big groups; the Big group outliers are perhaps the guides). The average distance traveled is very similar across groups, but its variance decreases as group size grows. Because the average distances are similar, the differences in average movement speed account for the expected differences in Moving time.
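
A minimal sketch of the stopped/moving rule follows. It rests on assumptions that are not stated in this answer: positions are grid cells of about 5 meters per side, a track is a time-ordered list of movement fixes, and the 10-meter threshold is applied after scaling each step to a 2-minute window. All constants and the name `classify_intervals` are hypothetical.

```python
import math

CELL_METERS = 5.0        # assumed size of one grid cell, in meters
STOP_DISTANCE_M = 10.0   # threshold from the text: less than 10 m ...
STOP_WINDOW_S = 120      # ... in 2 minutes

def classify_intervals(track):
    """Label each step between consecutive fixes as 'stopped' or 'moving'.

    `track` is a time-ordered list of (seconds, x, y) movement fixes for one
    visitor, with x and y in grid cells (assumed layout).
    """
    labels = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        meters = math.hypot(x1 - x0, y1 - y0) * CELL_METERS
        # Scale the covered distance to a 2-minute window before comparing.
        per_window = meters * STOP_WINDOW_S / dt
        labels.append("stopped" if per_window < STOP_DISTANCE_M else "moving")
    return labels
```

Check-in records would be handled separately, since a checked-in visitor is neither stopped nor moving under the definition above.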

With respect to how groups spend their time outside ride attractions, measured in adjusted hours (hours divided by the number of days the park was visited), we observe the following differences. Big groups tend to spend more time in Beer gardens and Food places than the other groups. Except for Individuals, who spend little time either shopping or in beer and food places, shopping time decreases as group size grows. We have no explanation for why Medium groups and Individuals behave similarly with respect to Beer gardens (it would be interesting to analyze this in more detail).

Some people, individuals or groups, visit the park on just one day, others on two days or the whole weekend. The following charts show how behavior can differ within each group category depending on the number of days visited. In particular, we observe some differences when we measure the average percentage of a day in the park that people spend checked in (enjoying some attraction). For Individuals, the more days they visit, the less time per day they spend checked in at attractions; in Medium groups, by contrast, people who visit the park on more days tend to spend more time in attractions on those days. For Small groups this variable remains stable. Similarly, the Stopped chart shows that Individuals spend more time stopped as the number of days visited grows, while Medium group members who visit on more days spend little time stopped and prefer to use their time visiting different attractions.

With respect to shopping, Individuals who visit the park on more days spend more time shopping, while Medium group members spend less time shopping if they visit the park on more days. Perhaps Individuals visiting on more days buy more gifts for friends and/or family, while people in Medium groups may be visiting the park with their friends and family.

 

 

 

 

 

MC1.2 – Are there notable differences in the patterns of activity in the park across the three days?  Please describe the notable differences you see.

 

Limit your response to no more than 3 images and 300 words.

 

Across the three maps, we find that Sunday is the day with the highest attendance, followed by Saturday, as shown by the darker colors for these two days. However, at "Creighton Pavilion" and "Grinosaurus Stage" this difference does not appear if we look only at check-ins rather than movement, as evidenced by the size of the circles associated with each location in the park. Nor does it appear at Auvilotops Express, where the difference runs in the opposite direction. The Crowding Index (a Gini index based measure) quantifies the degree of clustering of people in the park at a given moment. The index increases around opening and closing time, when people are near the exit. It also rises during large shows that bring many people together, and falls after those shows, when people move on to other attractions.
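
The exact definition of the Crowding Index is not given in this answer. One plausible reading, offered only as a sketch, is the Gini coefficient of the number of people per grid cell at a given moment: 0 when people are spread evenly over the cells, approaching 1 when everyone is in a few cells. The grid size of 100 and the function names are assumptions.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = even, near 1 = concentrated)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Formula on sorted values: G = sum((2i - n - 1) * x_i) / (n * total), i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(values, start=1))
    return weighted / (n * total)

def crowding_index(positions, grid_size=100):
    """Gini coefficient of occupancy over a grid_size x grid_size grid.

    `positions` is a list of (x, y) cells, one per visitor, at one moment,
    with 0 <= x, y < grid_size (assumed layout).
    """
    counts = [0] * (grid_size * grid_size)
    for x, y in positions:
        counts[y * grid_size + x] += 1
    return gini(counts)
```

Under this reading, a show that draws many visitors into a handful of cells pushes the index up, and the index drops again as the audience disperses, which matches the behavior described above.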

In figure 2, we show that there are differences in activity between regions across the three days. On both Saturday and Sunday there is more activity in "Wet Land" and "Tundra Land" than in "Coaster Alley" and "Kiddie Land" (on Friday that difference is smaller). On Sunday, the afternoon shows at Creighton Pavilion (Scott Jones's show) and "Grinosaurus Stage" have no activity, which results in a smaller difference in Wet Land and Tundra Land activity.

Another interesting pattern across the days concerns the kinds of places visited by people who come to the park on two days. As shown in Figure 3, the number of check-ins at "Grinosaurus Stage" increases considerably on the second day. For this analysis, Sunday activity was normalized, given the circumstances mentioned above. The following chart summarizes the activity of people who visited the park on all three days of the weekend.

 

 

 

 

MC1.3 – What anomalies or unusual patterns do you see? Describe no more than 10 anomalies, and prioritize those unusual patterns that you think are most likely to be relevant to the crime.

 

Limit your response to no more than 10 images and 500 words.

 

 

We believe that the crime was discovered on Sunday around 11:40 at Creighton Pavilion. Below we show the anomalies around that time that helped us confirm this hypothesis. Unusual number of stops on Sunday at 09:40: we found 55 user ids that stopped at Creighton Pavilion at that time. This does not match what happened on the previous days during the 09:00 hour (1 id on Friday, 6 on Saturday).

Peak attendance at Creighton Pavilion one hour earlier than usual: the following chart shows the difference in peak attendance at Creighton Pavilion, which on Sunday occurred before 12:00 instead of at 13:00. Note that this peak was not as pronounced as on previous days, indicating that the morning show was cancelled. In figure 3, we checked this against the communications data and arrived at the same hypothesis.

Unusual movement patterns near Creighton Pavilion: from these anomalies we could infer the pattern of movement after the discovery of the crime and the subsequent interruption of activities. We selected the users who were checked in at Creighton Pavilion at 11:45 and then analyzed how they moved in the following minutes. Figure 4 shows the migration of communications (Mini-Challenge 2 data) from Wet Land to other places after the crime was discovered. In the next figures, we analyze these movements on the park map.

In figure 5, we see how people scattered through the park. Each bar plot shows the count of ids that passed through that place, and some of these counts are unusual compared to Friday and Saturday.

Five minutes later, we can see how they split into more groups, replanning how to continue their day at the park. Figure 6 shows the increase in movement at Jurassic Road (place 22), Tar Pit Stop (place 50) and in the vicinity of Creighton Pavilion (place 32). Figure 7 shows how this division looks later, focusing on movement around Paleocarrie Carrousel (place 21), The Magic Cavern (place 48), Flight of the Swingodon (place 81) and Theresaur Food Stop (place 35).
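
The cohort-tracking steps described above (select the ids checked in at Creighton Pavilion at 11:45, then see where they are some minutes later) can be sketched as follows. The event layout, the minute-level "HH:MM" time strings, and the name `follow_cohort` are assumptions for illustration, not the workflow actually used in Tableau.

```python
from collections import Counter

def follow_cohort(events, place, at_time, later_time):
    """Count the locations, at `later_time`, of visitors checked in at `place` at `at_time`.

    `events` is a list of (visitor_id, time, kind, location) tuples with
    times already truncated to the minute (assumed layout). It is scanned
    twice, so it must be a list rather than a one-shot iterator.
    """
    cohort = {vid for vid, t, kind, loc in events
              if t == at_time and kind == "check-in" and loc == place}
    return Counter(loc for vid, t, kind, loc in events
                   if vid in cohort and t == later_time)
```

Running the same selection on Friday and Saturday gives the baseline against which the Sunday counts per place can be judged unusual.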

Unusual number of visits to Auvilotops Express: after midday, check-ins at Auvilotops Express increase significantly compared with previous days. This anomaly suggests that the evening shows scheduled in other areas did not take place; according to our hypothesis, that area is Wet Land. The average time per visitor (curve thickness) is also larger on Sunday at that time. Figure 9 illustrates this increase.

Shows scheduled at Creighton Pavilion and Grinosaurus Stage on Sunday afternoon were cancelled: there are no check-ins at either Creighton Pavilion or Grinosaurus Stage during that time; we only find stops and movements there. These anomalies and unusual patterns indicate that our hypothesis is correct: the crime occurred at Creighton Pavilion and was discovered around 11:40.